Towards a Lightweight RDMA Para-Virtualization for HPC

Authors

  • Shiqing Fan
  • Fang Chen
  • Holm Rauchfuss
  • Nadav Har'El
  • Uwe Schilling
  • Nico Struckmann
Abstract

Virtualization has gained increasing attention in recent High Performance Computing (HPC) development. While HPC provides scalability and computing performance, HPC in the cloud additionally benefits from the agility and flexibility that virtualization brings. One of the major challenges of HPC in virtualized environments is RDMA virtualization. Existing implementations of RDMA virtualization have focused on supporting VMs running Linux. However, HPC workloads rarely need a full-blown Linux OS. Compared to a traditional Linux OS, emerging library OSes such as OSv are becoming popular choices because they provide efficient, portable and lightweight cloud images. To enable virtualized RDMA for lightweight library OSes, drivers and interfaces must be redesigned to accommodate the underlying virtual devices. In this paper we present a novel design, the virtio-rdma driver for OSv, which aims to provide RDMA para-virtualization for lightweight library OSes. We compare this new design with existing implementations for Linux, and analyze the advantages of virtio-rdma’s architecture, its ease of migration to different operating systems, and the potential for performance improvement. We also propose a solution for integrating this para-virtualized driver into HPC platforms, enabling HPC application users to deploy their use cases smoothly in a virtualized HPC environment.

Similar resources

A Smart HPC Interconnect for Clusters of Virtual Machines

In this paper, we present the design of a VM-aware, high-performance cluster interconnect architecture over 10Gbps Ethernet. Our framework provides a direct data path to the NIC for applications that run on VMs, leaving non-critical paths (such as control) to be handled by intermediate virtualization layers. As a result, we are able to multiplex and prioritize network access per VM. We evaluate ...

Cellule: Lightweight Execution Environment for Accelerator-based Systems

The increasing prevalence of accelerators is changing the high performance computing (HPC) landscape to one in which future platforms will consist of heterogeneous multi-core chips comprised of both general purpose and specialized cores. Coupled with this trend is increased support for virtualization, which can abstract underlying hardware to aid in dynamically managing its use by HPC applicati...

Contain This, Unleashing Docker for HPC

Containers are a lightweight virtualization method for running multiple isolated Linux systems under a common host operating system. Container-based computing is revolutionizing the way applications are developed and deployed. A new ecosystem has emerged around the Docker platform to enable container based computing. However, this revolution has yet to reach the HPC community. In this paper, we...

Portable, high-performance containers for HPC

Building and deploying software on high-end computing systems is a challenging task. High performance applications have to reliably run across multiple platforms and environments, and make use of site-specific resources while resolving complicated software-stack dependencies. Containers are a type of lightweight virtualization technology that attempts to solve this problem by packaging applicati...

Fault Tolerance for HPC with OpenVZ Virtualization by Lite Migration Toolkit

The reliability of large-scale parallel jobs within a cluster or even across multi-clusters under the Grid or distributed computing environment is a long term issue due to its difficulties involving the monitoring and managing of a large number of compute nodes. To contribute to the issue, a Lite Migration toolkit with fault tolerance feature has been developed by the Distributed Computing Team...

Publication date: 2017